#+AUTHOR:Joshua Branson
#+TITLE: Plan
#+LATEX_HEADER: \usepackage{lmodern}
#+LATEX_HEADER: \usepackage[QX]{fontenc}
#+OPTIONS: H:10 toc:nil

* Plan

So for now I'm writing a program that assumes the file contains valid HTML and no comments.  With that assumption... actually, couldn't I use Bison or something similar to generate a parser for me?

Probably.


Anyway, how would I go about parsing an HTML file?  That's a really good question.  I'd have to keep a record of which elements I'm currently parsing.  For example:

#+BEGIN_SRC sh :results output :exports both
cat simple.html
#+END_SRC

#+RESULTS:
: <!DOCTYPE html>
: <html lang="en">
:     <head>
:         <title>Bootstrap 101 Template</title>
:     </head>
:     <body>
:         <h1>Hello, world!</h1>
:     </body>
: </html>

When getc() gives me the "B" in "Bootstrap", my data structures should look like the following:

element elements [] = 

0 -> element 
       name = "!DOCTYPE"
       attribute.name = "html"
       attribute.contents = ""
       older_sibling = 0
       younger_sibling = 0
       child = 0
       done_parsing = true

1 -> element
       name = "html"
       attribute.name = "lang"
       attribute.contents = "en"
       older_sibling = 0
       younger_sibling = 0
       done_parsing = false
       child = elementptr -> element
                 name = "head"
                 contents = ?  don't know yet, not done parsing
                 older_sibling = 0
                 younger_sibling = 0
                 done_parsing = false
                 child = elementptr -> element
                           name = "title"
                           contents = ?  don't know yet, not done parsing
                           older_sibling = 0
                           younger_sibling = ?  don't know yet, not done parsing
                           done_parsing = false
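
The record above could be written down as a C struct roughly like this.  It is only a sketch under a few assumptions: the field names follow the notes above, the =parent= pointer is borrowed from the closing-tag note at the end of this file, attributes are left out, and =new_element= and =add_child= are hypothetical helper names.

#+BEGIN_SRC c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct element {
    char *name;                      /* tag name, e.g. "head" */
    char *contents;                  /* text content; NULL until parsed */
    struct element *parent;          /* assumption: link back up the tree */
    struct element *child;           /* first (oldest) child */
    struct element *older_sibling;   /* previous element with the same parent */
    struct element *younger_sibling; /* next element with the same parent */
    bool done_parsing;               /* true once the closing tag was seen */
};

/* Hypothetical helper: allocate an element holding a copy of name.
   Returns NULL if allocation fails. */
struct element *new_element(const char *name)
{
    struct element *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    *e = (struct element){0};        /* all pointers NULL, done_parsing false */
    e->name = malloc(strlen(name) + 1);
    if (e->name == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->name, name);
    return e;
}

/* Hypothetical helper: append child as the youngest child of parent. */
void add_child(struct element *parent, struct element *child)
{
    child->parent = parent;
    if (parent->child == NULL) {
        parent->child = child;
        return;
    }
    struct element *last = parent->child;
    while (last->younger_sibling != NULL)
        last = last->younger_sibling;
    last->younger_sibling = child;
    child->older_sibling = last;
}
#+END_SRC

With this layout, reaching the "B" in "Bootstrap" would mean =html=, =head= and =title= have all been created and linked with =add_child=, and none of them is marked =done_parsing= yet.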

               
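#+BEGIN_SRC c
  /* fp is assumed to be an open FILE * for the HTML file. */
  int c;
  while ((c = getc(fp)) != EOF) {
      if (c != '<')
          continue;                 /* text content; ignored for now */
      c = getc(fp);
      if (c == '/') {
          parse_bottom_element();   /* closing tag, e.g. </head> */
      } else {
          ungetc(c, fp);            /* put back the first char of the name */
          parse_top_element();      /* opening tag, e.g. <head> */
      }
  }
#+END_SRC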

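#+BEGIN_SRC c
  /* Collect everything up to the closing '>' of the tag into tag[].
     fp (FILE *) and tag (a char array) are assumed to be file-scope. */
  void parse_top_element(void)
  {
      int c;
      size_t n = 0;
      while ((c = getc(fp)) != EOF && c != '>')
          if (n + 1 < sizeof tag)   /* leave room for the '\0' */
              tag[n++] = (char)c;
      tag[n] = '\0';
  }
#+END_SRC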

** parsing issues: I can't/shouldn't return an array from a function

https://stackoverflow.com/questions/11656532/returning-an-array-using-c 

I'll have to dynamically increase the size of the array inside the function.
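
Growing the array inside the function might look something like the sketch below, which uses =realloc= and doubles the capacity whenever the array is full.  The names =name_list= and =name_list_push= are hypothetical, and the array stores borrowed pointers to tag names only to keep the example short.

#+BEGIN_SRC c
#include <stdlib.h>

/* Hypothetical growable array of tag names (the strings are not copied). */
struct name_list {
    const char **items;
    size_t len;   /* slots in use */
    size_t cap;   /* slots allocated */
};

/* Append name, doubling the allocation when it is full.
   Returns 0 on success, -1 if realloc fails (the list is left unchanged). */
int name_list_push(struct name_list *list, const char *name)
{
    if (list->len == list->cap) {
        size_t new_cap = list->cap ? list->cap * 2 : 4;
        const char **p = realloc(list->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        list->items = p;
        list->cap = new_cap;
    }
    list->items[list->len++] = name;
    return 0;
}
#+END_SRC

The caller starts with a zeroed list (=struct name_list list = {0};=), passes its address to the function, and calls =free(list.items)= when finished.  Passing a pointer to the struct sidesteps returning an array at all.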
** returning a string from a function

https://stackoverflow.com/questions/25798977/returning-string-from-c-function

I should probably be allocating these strings via malloc.
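
A minimal sketch of returning a malloc'd string from a function, assuming the string is a tag read with =getc()= up to the closing ">".  The function name =read_until_gt= is hypothetical, and the caller is responsible for calling =free()= on the result.

#+BEGIN_SRC c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read characters from fp up to (not including) '>'
   or EOF and return them as a malloc'd, NUL-terminated string.
   The caller must free() the result.  Returns NULL if allocation fails. */
char *read_until_gt(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *s = malloc(cap);
    if (s == NULL)
        return NULL;
    int c;
    while ((c = getc(fp)) != EOF && c != '>') {
        if (len + 1 == cap) {            /* keep one byte for the '\0' */
            char *p = realloc(s, cap * 2);
            if (p == NULL) {
                free(s);
                return NULL;
            }
            s = p;
            cap *= 2;
        }
        s[len++] = (char)c;
    }
    s[len] = '\0';
    return s;
}
#+END_SRC

Because the buffer lives on the heap, it stays valid after the function returns, which a local =char= array would not.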
** dynamically allocate 2D array
https://www.geeksforgeeks.org/dynamically-allocate-2d-array-c/
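
One of several ways to allocate a 2D array dynamically, sketched here for a grid of =char= (for example, a fixed number of fixed-width name slots).  The helper names =alloc_2d= and =free_2d= are hypothetical.  This variant uses one block for the row pointers and one contiguous block for the data.

#+BEGIN_SRC c
#include <stdlib.h>

/* Allocate a rows x cols grid of chars, zero-filled.
   Returns NULL on failure or if either dimension is 0. */
char **alloc_2d(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0)
        return NULL;
    char **grid = malloc(rows * sizeof *grid);
    if (grid == NULL)
        return NULL;
    char *data = calloc(rows, cols);     /* one contiguous data block */
    if (data == NULL) {
        free(grid);
        return NULL;
    }
    for (size_t r = 0; r < rows; r++)
        grid[r] = data + r * cols;       /* row r starts cols bytes after row r-1 */
    return grid;
}

/* Free a grid returned by alloc_2d. */
void free_2d(char **grid)
{
    if (grid != NULL) {
        free(grid[0]);   /* the data block */
        free(grid);      /* the row pointers */
    }
}
#+END_SRC

After =char **g = alloc_2d(3, 8);= each =g[r]= is a zero-filled 8-byte row and =g[r][c]= indexes like an ordinary 2D array.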
** an example html data structure

#+BEGIN_HTML
<html>
  <body>
      <div>
        <p> Hello <span> World! </span> <em> What's happening?</em> </p>
      </div>
      <div>
         <div>
           <div>
             <p> cra cra How are you? </p>
             <br/>
           </div>
         </div>
      </div>
      <div>
        <p> What's going on here!? </p>
        <br/>
        <p> Hello </p>
      </div>
  </body>
</html>
#+END_HTML

If we reach a closing HTML tag, mark the current element as done and move back up to its parent:
  element->done_parsing = true;
  element = element->parent_element;  
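
As a small C sketch, the two lines above could be wrapped in a function.  The struct here is a cut-down stand-in with only the fields this step needs, and =close_element= is a hypothetical name.

#+BEGIN_SRC c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the element record; only the fields used here. */
struct element {
    const char *name;
    struct element *parent_element;
    bool done_parsing;
};

/* Called when a closing tag is read: mark the current element finished
   and return the element that becomes current (NULL after the root). */
struct element *close_element(struct element *current)
{
    if (current == NULL)
        return NULL;
    current->done_parsing = true;
    return current->parent_element;
}
#+END_SRC

In the main loop this would be used as =element = close_element(element);= each time a closing tag such as =</div>= is read.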
